Identifying more bloggers: Towards large scale personality classification of personal weblogs
Abstract
We report new results on the relatively novel task of automatic classification of blog author personality. Promisingly high classification accuracies have recently been reported for four important personality traits (Extraversion, Neuroticism, Agreeableness and Conscientiousness). However, the blog corpus used in that work required careful preparation and was consequently quite small (fewer than a hundred authors and fewer than half a million words). Here, we provide an initial report on the classification accuracies that can be achieved when classifiers conditioned on the small corpus are applied to a larger, automatically acquired blog corpus (over a thousand bloggers and approximately five million words), using lower-granularity personality data and substantially less manual preparation. Predictably, results on the larger corpus are not as impressive as those on the smaller; nevertheless, they point the way forward for further work.
Similar resources
What Are They Blogging About? Personality, Topic and Motivation in Blogs
Personal weblogs (blogs) provide individuals with the opportunity to write freely and express themselves online in the presence of others. In such situations, what do bloggers write about, and what are their motivations for blogging? Using a large blog corpus annotated with the LIWC text analysis program, we examine the content of blogs to provide insight into the role of personality in motiva...
The Identity of Bloggers: Openness and Gender in Personal Weblogs
Work has recently been completed on a PhD Thesis concerning individual difference in the language of personal weblogs (Nowson 2005). This paper highlights some of the results. Blogs are increasingly used as a resource for academic study, as evidenced by this symposium. Bloggers are not, however, representative of the population as a whole: they are more likely to be teenage or 20-something fema...
Identifying Bloggers' Residential Areas
This paper proposes a method to infer bloggers’ residential areas. Identifying bloggers’ residential areas will be useful as another axis to retrieve weblogs or for tasks that resolve ambiguous objects in terms of geographic contexts. Our method focuses on the local context of geographic location terms and uses binary classifiers to decide whether the context is indicating the writer’s resident...
A Study on the Perception of Students towards Educational Weblogs
Weblogs are a popular form of easy-to-use personal publishing that has attracted millions of bloggers to share their personal thoughts, opinions, and knowledge on the web. The versatility of weblogs as a communication medium has attracted interest from educators. Educational applications of weblogs have so far included journals, e-portfolios, learning diaries, and logbooks. As in the case of ot...
Capturing Global Mood Levels using Blog Posts
The personal, diary-like nature of blogs prompts many bloggers to indicate their mood at the time of posting. Aggregating these indications over a large number of bloggers gives a “blogosphere state-of-mind” for each point in time: the intensity of different moods among bloggers at that time. In this paper, we address the task of estimating this state-of-mind from the text written by bloggers. ...